# ReconOS: Extending OS Services Over FPGAs

Marco Platzner, Computer Engineering Group, University of Paderborn

marco.platzner@upb.de



## **Reconfigurable Hardware Operating Systems**

### Introduce a new layer of abstraction

- turn hardware accelerators into hardware tasks (threads)
- rely on an operating system to schedule, place, and execute these tasks

### Motivation

- increase productivity and portability
- exploit partial reconfigurability
- use reconfigurable hardware for dynamic task sets
- Operating system services
  - task management
    - load/remove/preempt/resume
    - communication, synchronization
    - scheduling
  - resource management
  - time management



## ReconOS

- Main goal: extend the multithreaded programming model to reconfigurable hardware
  - threads communicate and synchronize using programming model primitives, e.g., semaphores, mutexes, mailboxes, shared memory
  - established model in software-based systems (e.g., POSIX pthreads)
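
The primitives listed above can be illustrated with a small software-only sketch. It is not ReconOS code: Python's `queue.Queue` stands in for a mailbox, `threading.Lock` for a mutex, `threading.Semaphore` for a semaphore, and a dictionary for shared memory. The function name `run_pipeline` and the doubling "computation" are assumptions made for illustration.

```python
import queue
import threading

def run_pipeline(packets):
    """Run one producer and one worker thread over the given packets."""
    mb_in = queue.Queue()            # mailbox: producer -> worker
    mb_out = queue.Queue()           # mailbox: worker -> consumer
    cnt_mutex = threading.Lock()     # mutex protecting the shared counter
    shared = {"count": 0}            # shared memory
    slots = threading.Semaphore(2)   # semaphore: at most 2 packets in flight

    def producer():
        for p in packets:
            slots.acquire()          # wait for a free slot
            mb_in.put(p)
        mb_in.put(None)              # sentinel: no more packets

    def worker():
        while True:
            p = mb_in.get()          # blocks until a packet arrives
            if p is None:
                break
            mb_out.put(p * 2)        # stand-in for the real computation
            with cnt_mutex:          # lock, update counter, unlock
                shared["count"] += 1
            slots.release()          # free the slot for the producer

    threads = [threading.Thread(target=producer),
               threading.Thread(target=worker)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    results = [mb_out.get() for _ in range(mb_out.qsize())]
    return results, shared["count"]
```

In the ReconOS model, either thread could in principle be mapped to hardware without changing how the other one uses the mailboxes and the mutex.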



### **Hardware Threads**

- A hardware thread consists of two parts
  - OS synchronization finite state machine
  - user logic
- A hardware thread is connected to the
  - OS on the main CPU via the OSIF
  - main memory via the MEMIF



## **Hardware Threads**

- Function library (VHDL) for implementing the OS synchronization FSM

```
OSFSM: process (clk, reset)
  variable ack: boolean;
begin
  if reset = '1' then
    state <= GET_DATA;
    run <= '0';
    osif_reset (o_osif, i_osif);
    memif_reset (o_memif, i_memif);
  elsif rising_edge (clk) then
    case state is

      when GET_DATA =>
        mbox_get (o_osif, i_osif, MB_IN, data_in, done);    -- receive new packet
        next_state <= COMPUTE;

      when COMPUTE =>
        run <= '1';                                           -- process packet
        if ready = '1' then
          run <= '0';
          next_state <= PUT_DATA;
        end if;

      when PUT_DATA =>
        mbox_put (o_osif, i_osif, MB_OUT, data_out, done);  -- send processed packet
        next_state <= LOCK;

      when LOCK =>
        mutex_lock (o_osif, i_osif, CNT_MUTEX, done);        -- acquire lock
        next_state <= READ;

      when READ =>
        read (o_memif, i_memif, addr, count, done);          -- read counter
        next_state <= WRITE;

      when WRITE =>
        write (o_memif, i_memif, addr, count + 1, done);     -- update counter
        next_state <= UNLOCK;

      when UNLOCK =>
        mutex_unlock (o_osif, i_osif, CNT_MUTEX, done);      -- release lock
        next_state <= GET_DATA;

    end case;

    if done then state <= next_state; end if;

  end if;
end process;
```
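
The `done` handshake in the listing can be explored with a toy cycle-level model. This is a sketch under simplifying assumptions, not a model of the real OSIF/MEMIF timing: every state, including `COMPUTE`, is assumed to need the same fixed number of clock cycles (`latency`), only one thread exists so the lock never blocks, and the user logic is replaced by a doubling operation. The names `simulate_osfsm` and `NEXT_STATE` are hypothetical.

```python
from collections import deque

# State order taken from the FSM listing.
NEXT_STATE = {
    "GET_DATA": "COMPUTE", "COMPUTE": "PUT_DATA", "PUT_DATA": "LOCK",
    "LOCK": "READ", "READ": "WRITE", "WRITE": "UNLOCK", "UNLOCK": "GET_DATA",
}

def simulate_osfsm(packets, latency=3):
    """Toy model: each state completes after `latency` clock cycles."""
    mb_in, mb_out = deque(packets), []
    memory = {"count": 0}            # shared counter in main memory
    state, busy, cycles = "GET_DATA", 0, 0
    data = count = None
    # Toy stop condition: a real thread would block in GET_DATA instead.
    while not (state == "GET_DATA" and not mb_in):
        cycles += 1
        busy += 1
        done = busy == latency       # completion is reported on the last cycle
        if not done:
            continue                 # stay in the current state, as in the FSM
        if state == "GET_DATA":
            data = mb_in.popleft()   # receive new packet
        elif state == "COMPUTE":
            data = data * 2          # stand-in for the user logic
        elif state == "PUT_DATA":
            mb_out.append(data)      # send processed packet
        elif state == "READ":
            count = memory["count"]
        elif state == "WRITE":
            memory["count"] = count + 1
        # LOCK and UNLOCK have no visible effect with a single thread.
        state, busy = NEXT_STATE[state], 0
    return mb_out, memory["count"], cycles
```

With three packets and the default latency, `simulate_osfsm([1, 2, 3])` returns `([2, 4, 6], 3, 63)`: three packets times seven states times three cycles per state.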

### **Delegate Threads**

- A SW delegate thread is associated with every hardware thread
  - calls the OS kernel on behalf of the hardware thread
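
A minimal sketch of the delegate idea, assuming a request/response channel between the hardware thread and its delegate. Here the hardware thread is simulated by an ordinary Python thread, the OSIF by two queues, and the call names (`mbox_get`, `mutex_lock`, ...) are borrowed from the VHDL function library purely as message tags. The message format and all function names are assumptions, not the ReconOS protocol.

```python
import queue
import threading

def delegate_thread(osif_req, osif_resp, resources):
    """Software delegate: executes OS calls requested by one hardware thread."""
    while True:
        call, handle, arg = osif_req.get()   # wait for a request on the OSIF
        if call == "exit":
            return
        obj = resources[handle]              # look up the OS object by handle
        if call == "mbox_get":
            osif_resp.put(obj.get())
        elif call == "mbox_put":
            obj.put(arg)
            osif_resp.put(None)
        elif call == "mutex_lock":
            obj.acquire()
            osif_resp.put(None)
        elif call == "mutex_unlock":
            obj.release()
            osif_resp.put(None)

def simulated_hw_thread(osif_req, osif_resp, n, shared):
    """Stand-in for a hardware thread: it never touches OS objects directly."""
    def os_call(call, handle, arg=None):
        osif_req.put((call, handle, arg))
        return osif_resp.get()               # block until the delegate answers
    for _ in range(n):
        data = os_call("mbox_get", "MB_IN")
        os_call("mbox_put", "MB_OUT", data * 2)
        os_call("mutex_lock", "CNT_MUTEX")
        shared["count"] += 1
        os_call("mutex_unlock", "CNT_MUTEX")
    osif_req.put(("exit", None, None))

def run_demo(packets):
    resources = {"MB_IN": queue.Queue(), "MB_OUT": queue.Queue(),
                 "CNT_MUTEX": threading.Lock()}
    shared = {"count": 0}
    osif_req, osif_resp = queue.Queue(), queue.Queue()
    for p in packets:
        resources["MB_IN"].put(p)
    threads = [
        threading.Thread(target=delegate_thread,
                         args=(osif_req, osif_resp, resources)),
        threading.Thread(target=simulated_hw_thread,
                         args=(osif_req, osif_resp, len(packets), shared)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    out = resources["MB_OUT"]
    return [out.get() for _ in range(out.qsize())], shared["count"]
```

Because only the delegate calls into the OS, the kernel sees an ordinary software thread and needs no knowledge of the hardware behind it.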



## **Example ReconOS Architecture**



- Hardware threads can be loaded / removed by partial reconfiguration
- Hardware threads use cooperative multitasking

## **ReconOS Toolflow**



### **ReconOS Versions**

| Version     | OS / CPU, platforms, and features                                                                                              |
|-------------|--------------------------------------------------------------------------------------------------------------------------------|
|             | eCos / PowerPC, Virtex-2Pro (XUPV2P), Virtex-2 (Erlangen Slot Machine) and Virtex-4 (Avnet Virtex-4 PCIe Kit, ML403)           |
| Version 2.0 | Linux, eCos / PowerPC, Virtex-2Pro (XUPV2P) and Virtex-4 (ML403)<br>virtual memory support, FIFO interconnect                  |
| Version 3.0 | Linux, xilkernel / MicroBlaze, Virtex-6 (ML605)                                                                                |
| Version 3.1 | Linux / ARM, Xilinx Zynq (Zedboard)                                                                                            |
| Version ?   | Linux / ARM, Xilinx Zynq (Zedboard)<br>Vivado HLS for hardware thread design,<br>direct communication between hardware threads |

### www.reconos.de






"The ReconOS operating system for reconfigurable computing offers a unified multithreaded programming model and OS services for threads executing in software and threads mapped to reconfigurable hardware. By semantically integrating hardware accelerators into a standard OS environment, ReconOS allows for rapid design-space exploration, supports a structured application development process, and improves the portability of applications between different reconfigurable computing systems."

ReconOS - an operating system approach for reconfigurable computing


#### Multithreaded Programming Model

Easy to understand programming model based on hardware and software threads.

#### Active development and support

Many developers are working with ReconOS and form a community you want to join.

#### Extended and easy to use Toolchain

A complete and easy to use toolchain supports you while developing your ReconOS applications.





ReconOS - Operating System for Reconfigurable Computing, developed at the University of Paderborn

## **Experience with ReconOS (1)**

### **1.** ReconOS supports a step-by-step application design process



### **2.** ReconOS facilitates design space exploration

- Example: video object tracker
  - Virtex-4 FPGA (2 x PPC 405)
  - sw: all threads run in software
  - hw\*: a number of threads run in hardware
  - sw\*: a number of threads run on second (worker) CPU



### **3.** ReconOS enables (self-)adaptive systems

- Example: video object tracker
  - keep performance in [7, 10] fps
  - minimize number of cores
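
One way to read the two goals above is as a simple threshold controller: add a core when the frame rate drops below the band, release one when it rises above it. The sketch below assumes exactly that policy; the slides do not state which adaptation strategy the tracker uses, and the names `adapt_cores` and `run_controller` are hypothetical.

```python
def adapt_cores(fps, cores, max_cores, low=7.0, high=10.0):
    """Return the core count to use for the next interval."""
    if fps < low and cores < max_cores:
        return cores + 1      # too slow: add one core
    if fps > high and cores > 1:
        return cores - 1      # faster than needed: release one core
    return cores              # inside the band: keep the configuration

def run_controller(fps_trace, cores, max_cores):
    """Apply adapt_cores to a sequence of measured frame rates."""
    history = []
    for fps in fps_trace:
        cores = adapt_cores(fps, cores, max_cores)
        history.append(cores)
    return history
```

For a made-up trace, `run_controller([5, 6, 8, 12, 9], cores=1, max_cores=4)` returns `[2, 3, 3, 2, 2]`.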



Figure: frames 5, 150, and 260 of the video object tracker example.



## Why isn't ReconOS used more?

- ReconOS is an academic project
  - good as a playground for research ideas, but we have limited resources for making it easily usable for others
  - out of the box, only a few platforms are supported
  - more tutorials and examples are needed on the website
- ReconOS is (still) a complex environment
  - requires understanding of platform FPGA architectures and tool flows
  - requires some hardware design skills (for creating hardware threads)
- Performance is more important than productivity / flexibility
  - designers tend to optimize to the max; in the end they often have one big hardware thread and have thrown away the OS abstractions
- The multi-threading model of ReconOS is obviously not suitable for all types of applications

## **OS Abstractions for Heterogeneous Nodes**

- Delegate threads, cooperative multitasking (like in ReconOS) for tasks on FPGA and GPU
- Allows for preemption and heterogeneous migration of tasks
  - based on a programming pattern with check-pointing and strip-mining
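
The pattern named above can be sketched as follows, assuming that a task splits its work into fixed-size strips (strip-mining) and that a checkpoint is just the next index plus the partial result. After each strip the task cooperatively checks whether it should yield; the returned checkpoint can then be resumed elsewhere, for example on a different device. The function `run_task` and the sum-of-squares workload are illustrative assumptions.

```python
def run_task(data, checkpoint, strip_size, preempt_requested):
    """Process `data` strip by strip, starting from `checkpoint`.

    Returns (finished, checkpoint), where checkpoint = (next_index, partial_sum).
    """
    index, acc = checkpoint
    while index < len(data):
        strip = data[index:index + strip_size]
        acc += sum(x * x for x in strip)     # work on one strip
        index += len(strip)
        # Cooperative preemption point: only between strips.
        if index < len(data) and preempt_requested():
            return False, (index, acc)
    return True, (index, acc)
```

A usage example: start the task, preempt it after the first strip, and finish it from the checkpoint.

```python
data = list(range(10))
finished, ckpt = run_task(data, (0, 0), 4, lambda: True)    # e.g. on the FPGA
# finished is False, ckpt is (4, 14)
finished, ckpt = run_task(data, ckpt, 4, lambda: False)     # e.g. on the CPU
# finished is True, ckpt is (10, 285)
```

The strip size trades preemption latency against the overhead of checking for preemption.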





## **Scheduling for Heterogeneous Nodes: HETSCHED**

### Experiment

- sets of 32 tasks: Heat Distribution, Correlation Matrix, Gauss Blur, Markov Chain
- all tasks implemented on CPU, FPGA, GPU
- all schedulers are work-conserving
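
"Work-conserving" means that no device stays idle while a task is waiting. The sketch below shows one simple scheduler with that property: tasks are taken in FIFO order and each goes to whichever device becomes free first. It is not one of the schedulers evaluated in the experiment; the function name, the task representation (a runtime per device), and the tie-breaking by device name are assumptions.

```python
import heapq

def work_conserving_schedule(tasks, devices):
    """Assign FIFO-ordered tasks to the earliest-free device.

    tasks:   list of dicts mapping device name -> runtime on that device
    devices: list of device names
    Returns (assignment, makespan); assignment holds
    (task_index, device, start_time, finish_time) tuples.
    """
    free_at = [(0.0, d) for d in devices]    # (time the device is free, name)
    heapq.heapify(free_at)
    assignment, makespan = [], 0.0
    for idx, runtimes in enumerate(tasks):
        start, dev = heapq.heappop(free_at)  # earliest-free device
        finish = start + runtimes[dev]
        assignment.append((idx, dev, start, finish))
        makespan = max(makespan, finish)
        heapq.heappush(free_at, (finish, dev))
    return assignment, makespan
```

Being work-conserving does not make a schedule good: in this sketch a task may land on a device where it runs slowly simply because that device was free first.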



## **Summary: OS Services for FPGAs**

- ReconOS: multithreaded programming for software and hardware
- Heterogeneous node: preemption and heterogeneous migration

- Does ReconOS get software programmers on FPGAs?
- Which OS services are useful for FPGAs in ...
  - embedded systems
  - high-performance computing
  - warehouse scale computers

